Geodesic Convexity of the Symmetric Eigenvalue Problem and Convergence of Riemannian Steepest Descent
We study the convergence of the Riemannian steepest descent algorithm on the
Grassmann manifold for minimizing the block version of the Rayleigh quotient of
a symmetric and positive semi-definite matrix. Even though this problem is
non-convex in the Euclidean sense and only very locally convex in the
Riemannian sense, we discover a structure for this problem that is similar to
geodesic strong convexity, namely, weak-strong convexity. This allows us to
apply similar arguments from convex optimization when studying the convergence
of the steepest descent algorithm but with initialization conditions that do
not depend on the eigengap $\delta$. When the eigengap is positive, we prove exponential
convergence rates, while otherwise the convergence is algebraic. Additionally,
we prove that this problem is geodesically convex in a neighbourhood of the
global minimizer, with a radius that depends on the eigengap $\delta$.
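To make the iteration concrete, the following is a minimal NumPy sketch of Riemannian steepest descent on the Grassmann manifold for the block Rayleigh quotient f(X) = trace(X^T A X) with X^T X = I; the fixed step size, QR-based retraction, and random initialization are illustrative assumptions rather than the paper's exact setting.

    import numpy as np

    def grassmann_steepest_descent(A, p, eta=0.05, iters=500, seed=0):
        """Minimize trace(X^T A X) over n-by-p orthonormal X (illustrative sketch)."""
        n = A.shape[0]
        rng = np.random.default_rng(seed)
        X, _ = np.linalg.qr(rng.standard_normal((n, p)))   # random orthonormal start
        for _ in range(iters):
            AX = A @ X
            grad = 2.0 * (AX - X @ (X.T @ AX))             # Riemannian gradient: projection of 2AX
            X, _ = np.linalg.qr(X - eta * grad)            # descent step followed by QR retraction
        return X

    A = np.diag(np.arange(1.0, 11.0))                      # PSD test matrix; gap between 3rd and 4th eigenvalues is 1
    X = grassmann_steepest_descent(A, p=3)
    print(np.round(np.linalg.eigvalsh(X.T @ A @ X), 3))    # approx the 3 smallest eigenvalues [1, 2, 3]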
Gradient-type subspace iteration methods for the symmetric eigenvalue problem
This paper explores variants of the subspace iteration algorithm for
computing approximate invariant subspaces. The standard subspace iteration
approach is revisited and new variants that exploit gradient-type techniques
combined with a Grassmann manifold viewpoint are developed. A gradient method
as well as a conjugate gradient technique are described.
Convergence of the gradient-based algorithm is analyzed and a few numerical
experiments are reported, indicating that the proposed algorithms are sometimes
superior to a standard Chebyshev-based subspace iteration when compared in
terms of the number of matrix-vector products, but do not require estimating
optimal parameters. An important contribution of this paper, which enables this
good performance, is an accurate and efficient implementation of an exact line
search. In addition, new convergence proofs are presented for the
non-accelerated gradient method, including locally exponential convergence
when started in a neighbourhood of the dominant subspace whose size depends
on the spectral gap.
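As a point of reference for the gradient-type variants, here is a minimal NumPy sketch of the standard subspace iteration baseline mentioned above; the stopping test and tolerance are illustrative assumptions. The paper's gradient and conjugate gradient variants replace this fixed power step with a search direction and an exact line search on the Grassmann manifold.

    import numpy as np

    def subspace_iteration(A, p, iters=300, tol=1e-10, seed=0):
        """Approximate the dominant p-dimensional invariant subspace of symmetric A (baseline sketch)."""
        n = A.shape[0]
        rng = np.random.default_rng(seed)
        X, _ = np.linalg.qr(rng.standard_normal((n, p)))
        for _ in range(iters):
            Y, _ = np.linalg.qr(A @ X)                     # block power step plus re-orthonormalization
            if np.linalg.norm(Y - X @ (X.T @ Y)) < tol:    # part of Y outside span(X): subspace change
                return Y
            X = Y
        return X

    A = np.diag(np.arange(1.0, 11.0))
    X = subspace_iteration(A, p=3)
    print(np.round(np.linalg.eigvalsh(X.T @ A @ X), 3))    # approx the 3 largest eigenvalues [8, 9, 10]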
Communication-Efficient Distributed Optimization with Quantized Preconditioners
We investigate fast and communication-efficient algorithms for the classic
problem of minimizing a sum of strongly convex and smooth functions that are
distributed among different nodes, which can communicate using a limited
number of bits. Most previous communication-efficient approaches for this
problem are limited to first-order optimization, and therefore have
\emph{linear} dependence on the condition number in their communication
complexity. We show that this dependence is not inherent:
communication-efficient methods can in fact have sublinear dependence on the
condition number. For this, we design and analyze the first
communication-efficient distributed variants of preconditioned gradient descent
for Generalized Linear Models, and for Newton's method. Our results rely on a
new technique for quantizing both the preconditioner and the descent direction
at each step of the algorithms, while controlling their convergence rate. We
also validate our findings experimentally, showing fast convergence and reduced
communication.
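The following minimal NumPy sketch illustrates the general idea of quantizing a preconditioned descent direction before it is communicated, on a simple least-squares problem with a single node; the uniform quantizer, the unit step size, and the explicit inverse-Hessian preconditioner are illustrative stand-ins, not the paper's actual scheme.

    import numpy as np

    def quantize(v, bits=8):
        """Coordinate-wise uniform quantization of v to 2**bits levels over its range (generic stand-in)."""
        lo, hi = float(v.min()), float(v.max())
        if hi == lo:
            return v.copy()
        scale = (hi - lo) / (2 ** bits - 1)
        return lo + np.round((v - lo) / scale) * scale

    rng = np.random.default_rng(0)
    n, d = 200, 10
    A = rng.standard_normal((n, d))
    b = A @ rng.standard_normal(d)                         # consistent linear model
    P = np.linalg.inv(A.T @ A / n)                         # preconditioner: inverse Hessian of the quadratic

    x = np.zeros(d)
    for _ in range(30):
        g = A.T @ (A @ x - b) / n                          # gradient of the least-squares objective
        x = x - quantize(P @ g)                            # apply a quantized preconditioned direction
    print(np.linalg.norm(A @ x - b))                       # residual stays small despite quantized updates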
Distributed principal component analysis with limited communication
We study efficient distributed algorithms for the fundamental problem of principal component analysis and leading eigenvector computation on the sphere, when the data are randomly distributed among a set of computational nodes. We propose a new quantized variant of Riemannian gradient descent to solve this problem, and prove that the algorithm converges with high probability under a set of necessary spherical-convexity properties. We give bounds on the number of bits transmitted by the algorithm under common initialization schemes, and investigate the dependency on the problem dimension in each case.
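A minimal NumPy sketch of the flavour of such a method: Riemannian gradient ascent on the unit sphere for the leading eigenvector, where the Riemannian gradient is quantized before it is applied. The single-node setting, uniform quantizer, and fixed step size are illustrative assumptions rather than the paper's algorithm.

    import numpy as np

    def quantize(v, bits=6):
        """Coordinate-wise uniform quantization of v (generic stand-in for the paper's scheme)."""
        lo, hi = float(v.min()), float(v.max())
        if hi == lo:
            return v.copy()
        scale = (hi - lo) / (2 ** bits - 1)
        return lo + np.round((v - lo) / scale) * scale

    rng = np.random.default_rng(1)
    A = np.diag(np.linspace(1.0, 10.0, 20))                # covariance-like matrix with a known top eigenvector
    x = rng.standard_normal(20)
    x /= np.linalg.norm(x)                                 # start on the unit sphere
    eta = 0.1
    for _ in range(300):
        g = 2.0 * (A @ x - (x @ A @ x) * x)                # Riemannian gradient of x^T A x on the sphere
        x = x + eta * quantize(g)                          # ascent step using a quantized gradient
        x /= np.linalg.norm(x)                             # retraction back to the sphere
    print(abs(x[-1]))                                      # near 1: aligned with the leading eigenvector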